For many years, computerised concordancing has been the domain of computational
linguists, corpus linguists, lexicographers and dictionary compilers, working with large 
corpora of millions of words. The use of small-corpora concordancing in ESL settings 
is a relatively new application and has sparked keen interest among many researchers 
and teachers since the mid-80s. This paper discusses the use of small-corpora 
concordancing in the three domains of ESL: 1. syllabus design and evaluation, 2. 
classroom teaching, and 3. test construction. In particular, the classroom concordancing 
approach as an evolving ESL methodology is discussed with reference to its rationale, 
its potentials, its current applications and its impact. The paper concludes with some 
critical comments on what has been achieved so far with small-corpora concordancing 
and points out some directions for the future. 



Introduction 

This paper reviews from the perspective of ESL teaching and learning the roles and applications 
of small-corpora concordancing (SCC), focusing mainly on SCC for the classroom, and touches briefly, 
as far as present literature shows, on SCC in syllabus design and evaluation, and in test design. 

What is concordancing? 

The term concordancing originates from what has been known as concordances. The COBUILD 
dictionary defines a concordance as "an alphabetical list of the words in a book or a set of books which 
also says where each word can be found and often how it is used". Tribble (1990a) refers to a 
concordance as "a reference work designed to assist in the exegesis of biblical and other socially valued 
text". Tribble and Jones (1990) point out that concordances have been produced since the Middle Ages 
on popular works of well-known writers, such as the works of Shakespeare, and most of these have been 
undertaken manually, and, as one can imagine, painstakingly. 

With the advent of the computer, concordances can be generated with the "speed and reliability" 
(Tribble and Jones, 1990) that perhaps manual concordancing could never match. As Sinclair (1991) puts
it:

Thirty years ago, ...it was considered impossible to process texts of several million 
words in length. Twenty years ago it was considered quite possible but lunatic. Ten years 
ago it was considered quite possible but still lunatic. Today it is very popular. (p. 1)

Tribble (1990b) also remarks, "the effort involved in such a task [concordancing], when taken 
manually, was intimidatingly large, and ...was more than most individuals would ever want to take on." 
This could be true even with any text more than a few hundred words, let alone texts measuring up to
the millions. As Foulds (1991) observes, "the time required to do such a thing [text processing] on a
regular basis for texts more than a few hundred words long would have been so great as to render the
value, if there were any, totally uneconomic". 

So the whole idea of computerised concordancing lies in making feasible what people might have
always wished to do but have avoided doing because of the labour and time involved; and as
computerised concordancing has grown in popularity, more and more people have come to realise its
potential and subsequently embarked on various concordance-related projects in linguistic research and
ESL applications.

The term concordancing is, however, generally used in the literature relating to ESL teaching 
and learning without a very clear definition. It is generally understood to refer to a way of analysing texts. 
Tribble and Jones (1990) describe concordancing as "locating all the occurrences of a particular word and 
listing the contexts" (p. 7), while Levy (1990) defines a concordance as "a collection of all the occurrences
of a word, each in its own textual environment together with references and word frequencies" (p. 178). 

As computerised concordancing developed, manual concordancing disappeared as a matter of 
course and the term concordancing has become understood to be computer-based rather than manually 
performed, whenever it is used. In a review paper discussing the MS-DOS concordancers, Higgins (1991)
provides the following definition: "A concordance of a word is a set of citations or line references,
allowing every occurrence of that word within a corpus of text to be retrieved." (p. 92) 

What the computer does in concordancing is to display all the contexts in which a certain word 
or string appears in a text or collection of texts, called a corpus. Software employed to achieve this end 
is thus called concordancing software. Sometimes, computer programs are referred to as concordance
generators or concordancers (Tribble and Jones, 1990) and sometimes a concordancing facility may be 
included as one of the functions of a set of programs for text analysis. COMPAID is one example of this 
(see Fang, 1991). Utility programs attached to a computer's operating system, such as FIND.EXE in the
MS-DOS environment (Higgins, 1991), or home-made macros to be run under more sophisticated word 
processing packages, like WordPerfect and Microsoft WORD, can also serve the purpose. (See Tribble 
& Jones, 1990, pp. 84-89.) 

The ways in which a computer can display the context of a search word or key word may vary 
depending on the software used and the operation selected. The sentence concordance displays the 
sentences in which the search word is used, and paragraph concordance displays the paragraph (Johns, 
1988). KWIC (key-word-in-context) concordances, by far the most widely used among researchers and 
teachers, display the search word in the middle, with as much context as will fit into the line which is 
truncated at either side (Tribble & Jones, 1990; Higgins, 1991). Concordances thus generated by the
computer can be sent either to screen, or to printer as hard copy, or to file for future manipulation. An 
example of a KWIC concordance output is given in Appendix A. 
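
The KWIC layout described above can be illustrated with a minimal sketch. It is not taken from any of the concordancers reviewed here; the function name `kwic`, the `width` parameter and the sample text are assumptions made only for illustration, and matching is restricted to whole words, ignoring case.

```python
import re

def kwic(text, keyword, width=30):
    """Return one KWIC line per whole-word, case-insensitive match of keyword."""
    flat = " ".join(text.split())  # collapse line breaks so context can span lines
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for m in pattern.finditer(flat):
        left = flat[max(0, m.start() - width):m.start()]
        right = flat[m.end():m.end() + width]
        # Right-justify the left context so the key word lines up in one column;
        # context is simply cut off (truncated) at either side.
        lines.append(f"{left:>{width}}{m.group(0)}{right:<{width}}")
    return lines

# Hypothetical sample text, used only to show the layout.
sample = (
    "The teacher can use concordance output to prepare teaching materials. "
    "A concordance of a word is a set of citations."
)
for line in kwic(sample, "concordance", width=20):
    print(line)
```

Because every line has the same width and the search word always starts in the same column, the output can be sent to screen, printer or file without further formatting.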

Large-corpora concordancing

The use of concordancing software for text analysis has, for many years, been limited to the 
domain of computational linguistics and corpus linguistics, both being relatively new areas in the study 
of language, made possible by the advent and availability of the computer. These analyses have been 
carried out mainly with mainframe computers on very large corpora running into tens of millions of 
words. The interest in these analyses has stemmed mainly from the desire to provide objective 
descriptions of how the language really works, involving people like lexicographers and dictionary 
compilers. 

Among the major projects, the most well-known include the COBUILD project carried out at 
the University of Birmingham (Sinclair, 1987), from which quite a number of dictionaries and reference
works have been completed and marketed commercially (Sinclair, et al., 1987, 1990). Other examples
include the Brown University project on its Corpus of Present-day Edited American English (quoted in
Yang, 1985) and the Lancaster-Oslo-Bergen (LOB) project at the University of Lancaster (quoted in
Levy, 1990), and the JDEST project on English for Science and Technology (Yang, 1985) at the Shanghai
Jiao Tong University.

Small-corpora concordancing (SCC)

Apart from concordancing with large corpora, there has also been a growing interest in the use 
of small corpora analysable with microcomputers. This growing interest coincides with the surge of 
interest in computer-assisted language learning (CALL) and is catalysed by an era when microcomputers 
are becoming more and more accessible to ESL teachers and researchers. 

This interest in small scale corpora concordancing began in the mid-80s, most notably with the
work of Higgins and Johns (1984), and Johns (1986, 1988), which stirred up a movement in SCC. The
result of the movement is that computerised text analysis has been brought much more closely to
teachers, course designers, materials developers and learners alike, and SCC as a tool for text analysis
or as a pedagogic activity is increasingly brought to test and experimentation in various places all over
the world where one or more microcomputers are available. 

SCC for syllabus design and evaluation 

As early as 1988, Sinclair and Renouf put forward the idea of designing a general English 
syllabus based on "the common uses of common words" as identified by the computer-generated 
frequency lists of the COBUILD corpus (Sinclair & Renouf, 1988). Using data from the same corpus, 
Willis and Willis (1988) further developed the idea and completed a general English course while Willis 
completed designing his lexis-based syllabus, called the lexical syllabus (Willis, 1990). 

As far as SCC is concerned, Flowerdew took the lead in its application in syllabus and course
design. Flowerdew (1991) used concordance-based word counts to establish the relative importance of
vocabulary items and provided criteria for syllabus selection and grading. Using a specialist corpus of
transcriptions of Biology lectures, he compared the word frequencies with those in the COBUILD general
corpus and observed some overall similarity and some significant differences. It is argued that these
observations could form a basis for course design in ESP contexts. Flowerdew suggests that SCC can be
employed to identify useful items to teach, reveal syntactic patterns in which certain words occur and
locate functional and notional areas which might be included in a syllabus (Flowerdew, 1991, pp. 38-39).
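
A comparison of word frequencies between a specialist and a general corpus can be sketched as follows. This is only an illustration of the general idea, not Flowerdew's procedure: the tokeniser, the per-1,000-word normalisation, the function names and the two toy texts are all assumptions.

```python
import re
from collections import Counter

def tokenize(text):
    """Lower-case word tokens; a deliberately crude tokeniser (an assumption)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def per_thousand(tokens):
    """Map each word to its frequency per 1,000 running words."""
    counts = Counter(tokens)
    total = len(tokens)
    return {word: 1000 * count / total for word, count in counts.items()}

def compare(specialist_text, general_text, top=5):
    """Rank words by how much more frequent they are in the specialist text."""
    spec = per_thousand(tokenize(specialist_text))
    gen = per_thousand(tokenize(general_text))
    # A word absent from the general text counts as 0 there.
    diffs = {word: rate - gen.get(word, 0.0) for word, rate in spec.items()}
    # Largest difference first; ties are broken alphabetically.
    return sorted(diffs.items(), key=lambda item: (-item[1], item[0]))[:top]

# Hypothetical toy texts standing in for a specialist and a general corpus.
specialist = "the cell wall protects the cell and the cell membrane"
general = "the man saw the dog and the dog saw the man"
for word, diff in compare(specialist, general, top=3):
    print(f"{word:<10}{diff:8.1f}")
```

Normalising to a rate per 1,000 words matters because the two corpora are rarely the same size; raw counts would favour whichever corpus is larger.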

Ma (1993a), in his concordance-based analysis of the genre of direct mail sales letters,
discovered the attachment of certain mood and modality to distinct sequenced moves exhibited in his
50-letter corpus. Imperatives are found to abound in both the opening and action-getting moves but in
the former they are never used with the polite marker please. Can, will and may appear in large numbers
in the product-description move while must, ought to, and should hardly exist. Thematised purpose clauses
with For or To belong to the overwhelming majority of the action-getting move. It is suggested that these 
observations should contribute valuable references for the design of syllabuses and materials of business 
writing courses where students need to write this kind of sales letter. 

Apart from designing syllabuses, SCC can also be used for evaluating an existing course or 
programme and its materials (Flowerdew, 1991). In Flowerdew's corpus, connectors like then are found
to appear between the subject and verb, rather than sentence-initially as taught in many published materials
(p. 38). The defining function is seen to be expressed almost entirely by the word call while commercially
available materials tend to focus on the word define. Published materials are also found to have
overlooked the intervening adverbials in many of the passive constructions (p. 40). In Ma's corpus, on the
other hand, the postscript component in a sales letter, shunned in most published materials as being a
sign of poor planning, is shown to be the rule rather than the exception (Ma, 1993a). The language of
refutation, which receives heavy emphasis in an EAP course, is refuted, quite ironically, by Pickard (1992)
with reference to corpus evidence.

SCC for the classroom (Classroom concordancing)

The idea of using SCC in the classroom for the teaching of ESL, generally known as classroom
concordancing (CC), is strongly supported by a number of researchers and applied linguists (notably 
Stevens, Johns, and Tribble & Jones). 

Why: SCC as methodology

In the ESL classroom, concordancing is seen more as an approach to teaching or learning than 
as a way of text analysis. The rationale for the CC approach is one of authenticity and discovery. Johns 
(1986) describes this concordance-based approach as data-driven learning (DDL). As the name suggests, 
this approach is characterised by language data taking on a primary role in language learning. Johns 
suggests that concordances provide "intake", (after Corder, 1967) i.e. the part of input that is actually 
helpful, to the language learner, which strikes a healthy balance between the "highly-organised, graded 
and idealised language of the typical coursebook" and the "potentially confusing but far richer and more 
revealing authentic communication" (Johns, 1986). 

With regard to authenticity, Stevens (1988) points out the "realism and relevance" that CC can 
offer. While teacher-invented exercises for vocabulary can often contain inadvertently interjected 
artificiality, concordance-based material "assures that contexts will always be real ones" and "relevance 
is achieved when the corpus of text used is appropriate to the language learners for whom the exercise
is being prepared". 

Johns (1988) further breaks down the idea of authenticity into three aspects: authenticity of 
script, of purpose and of activity. He believes that in CC, the teacher takes the role of an authentic text 
presenter rather than the traditional text preparer. Authenticity of purpose is achieved by concordancing 
texts that "students are having to work with on their courses or in their research" and authenticity of 
activity is achieved when what is done with the text is transferable to real world situations. 

Levy (1990) thinks that concordances "present the facts of the language in a precise way" as they
are based upon "actual usage". Concordance users are thus consulting "the source, the original instances
of a word's use" rather than trying to peep at its usage via an intermediary, e.g. a dictionary. As Johns
(1991b) states: "What distinguishes the DDL (CC) approach is the attempt to cut out the middleman
as far as possible and to give the learner direct access to the data, ..." (p. 30).

Johns (1991b) also sees CC as an attempt to contextualise and demythologise language. By 
looking at natural language in use, SCC "dispels the myths and distortions that have arisen from reliance 
on 'armchair' linguistics" and it also dispels the need for the language teacher to answer learners' queries
by resorting to intuition alone. 

As far as discovery is concerned, Johns (1988) points out that SCC is in line with the assumption 
that effective language learning is a form of linguistic research. He believes that the teacher is potentially 
most effective when he or she is most at risk, and thus when the teacher is placed alongside the learners 
in attempting to solve communication problems, made possible by concordancing subject-related texts, 
the teacher is then able to gain valuable insights which might be otherwise inaccessible (Johns, 1988). 

In relation to the concepts of authenticity and discovery, Tribble and Jones (1990) point out that 
the real value of concordancing lies in the question of visibility. Concordancing software enables the user
to visualise text features in ways that have never been possible. Tribble (1990a) describes the use of CC
as "making the invisible visible" and he comments that CC is a "very new approach to the very old task
of teaching and learning a language". Taking this visibility dimension of concordancing further, Rundell
and Stock (1992) remark, "Perhaps the single most striking thing about corpus evidence ...is the
inescapability of the information it presents." Though Rundell and Stock are speaking from a 
lexicographer's point of view, their comments are certainly applicable to learners using the concordancer 
as a language learning tool since learners also assume a role very similar to that of a linguistic researcher. 

On the learner's road to discovery, the role of the computer and the concordancer is described 
as a special type of informant, giving the learner access to linguistic data (Johns, 1991a). Johns (1991a)
describes this approach as a break away from the rule-based approach into the data-driven approach and 
identifies it as a kind of inductive learning where it differs from the traditional approach in that data 
replaces the teacher as the basis. It is believed that the CC approach can build learners' competence by 
giving them access to the actuality of linguistic performance. 

How: getting a concordancer to work 

There are two prerequisites for classroom concordancing. First, there must be the computer 
hardware and software which operate the concordancing and second, there must be a corpus for the
computer to work on. 

Software selection. Tribble and Jones (1990) make a distinction between three different types
of concordancing software: streaming concordancers, text-indexing software and in-memory text 
consulters. Streaming concordancers read a text one line after another and produce concordances as they 
work through the texts. Text-indexers are those that create an index of the text in one operation and then 
allow for different types of text retrieval activities, including concordancing. One example of these is 
WordCruncher. The last type, in-memory text concordancers, reads the whole text into the computer's 
working memory and then operates on it to show different types of information as desired by the user. 
Longman Mini-concordancer is an example (Tribble and Jones, 1990, p. 13).
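
The difference between the streaming and the indexing styles can be made concrete with a small sketch. It does not describe how WordCruncher or the Longman Mini-concordancer are actually built; the function names are hypothetical, and words are split on whitespace only, so punctuation is not handled.

```python
import io
from collections import defaultdict

def stream_concordance(lines, keyword):
    """Streaming style: scan line by line, yielding hits as they are met."""
    key = keyword.lower()
    for number, line in enumerate(lines, start=1):
        if key in line.lower().split():
            yield number, line.rstrip("\n")

def build_index(lines):
    """Indexing style: one pass builds a word -> line-numbers table."""
    index = defaultdict(list)
    stored = []
    for number, line in enumerate(lines, start=1):
        stored.append(line.rstrip("\n"))
        for word in set(line.lower().split()):
            index[word].append(number)
    return index, stored

def lookup(index, stored, keyword):
    """Later queries consult the index instead of re-reading the text."""
    return [(n, stored[n - 1]) for n in index.get(keyword.lower(), [])]

# Hypothetical three-line text; io.StringIO stands in for a file on disk.
text = "the corpus is small\na small corpus helps\nteachers use it\n"
print(list(stream_concordance(io.StringIO(text), "corpus")))
index, stored = build_index(io.StringIO(text))
print(lookup(index, stored, "small"))
```

The streaming function must re-read the whole text for every new search word, whereas the index is built once and then answers any number of queries. An in-memory program sits between the two: it loads the whole text once and searches it directly, which is why its capacity is bounded by the machine's memory.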

Tribble and Jones (1990) recommend, though rather implicitly, using in-memory concordancers
for classroom concordancing. They point out that this type of software is limited by the memory size of 
the computer but has the advantage of a variety of text-handling capabilities once a file, or set of files, 
has been loaded. Streaming concordancers are seen as too slow to justify classroom applications while 
text-indexers are viewed as too sophisticated and should be left only to large-scale researchers (p. 14). 

Higgins (1991), in his review of MS-DOS concordancers, makes a distinction between three types
of concordancers: dedicated research concordancers, dedicated classroom concordancers, and text utilities.
Dedicated classroom concordancers are characterised by their "rapid results and clear displays", and are
what he thinks to be appropriate tools for the ESL teacher in the classroom. 

Corpus creation. While most people talk about concordancing with a corpus of some kind, it 
is worth pointing out that concordancing can actually be done with individual texts. Tribble and Jones
(1990) point out that individual texts could be the target for concordancing if the objective is to analyse 
the language of that text (p. 15). 

In corpus creation, a distinction is generally made between a general corpus and a specialist 
corpus, the choice depending obviously on the needs of the learners. Tribble and Jones (1990) specify 
the following criteria for the creation of a general corpus for classroom use: 

1. Use authentic, natural language 

2. Use contemporary texts 

3. Exclude archaic forms 

4. Exclude dialect 

5. Stick to prose 

6. Exclude technical material (p. 18).

To create a suitable corpus for the general English classroom, Tribble and Jones (1990) describe
the following possible methods. Similar methods are also advocated by Sinclair (1991), who works mainly 
with mainframes. 



1. Keyboarding

2. Optical scanning 

3. Adaptation from ready-made text files, including word-processed documents, READ.ME files
accompanying software packages, sources of text by access to a network or to colleagues (Tribble and 
Jones, 1990, pp. 19-21). 

As optical scanning facilities become more and more popular, with prices of high-technology 
products falling all the time, it can be expected that more and more people will take advantage of this 
convenient means of input for corpus creation, both in research and classroom applications, rather than 
relying on manual keyboarding. It must be pointed out, however, that the margin of error with most 
optical scanning hardware and software today is still disappointingly large, which makes them less than 
an ideal means of input. Any heavy reliance on machine-read operations must be offset by a sufficiently 
large corpus to make the database a useful and dependable one. 

Tribble and Jones (1990), advising on teacher-created corpora for classroom concordancing, 
suggest accumulating a number of specialist corpora to form a general corpus. While seeing this as an 
easier job than trying to assemble a large general corpus at one time, they point out that accumulation 
in this way also addresses the need to achieve "balance and variety" in a general corpus (p. 16), though
one might wonder how this could avoid including technical material, one of the principles Tribble and 
Jones (1990) put forward for general corpus creation. 

Corpora size. It is said that "small corpora can play a subsidiary role in investigating specialised 
varieties of texts that are neglected in large corpora or where the classification systems of the large 
corpora are insufficiently delicate to recover the information required" (Johns, 1986, p. 158). But how small
should a small corpus be? According to Tribble and Jones (1990), it appears as a general rule that, even
working with small corpora, a bigger corpus gives richer, more interesting and more representative 
information while too small a corpus may result in distortion (pp. 15-16). 

Tribble and Jones suggest that a corpus of 50,000 words should be very useful for classroom 
purposes (p. 14). The corpus Tribble and Jones used in their experimentation, the ELT Text Pack Corpus, 
consists of texts from both written and spoken English running into 45,000 words, which is not as large
as one might have imagined necessary. The rationale behind this 50,000 word threshold is unclear, but
a study of the size of the corpora used by some of the researchers mentioned in this paper, as given in
Table 1, will give a rough idea of how small small corpora generally are, noting that some of them are
not meant for classroom use. 

Table 1. SCC in ESL: Corpora

Researcher               Corpus                                        Number of words
King, 1989               Academic lectures and tutorials               155,000
King, 1989               Scientific & technical journals               11,400
Tribble & Jones, 1990    ELT Text Pack Corpus                          45,000
Tribble, 1990a & 91      English Historical Review Corpus              104,555
Tribble, 1990a & 91      Longman Corpus of Learners' English           54,861
Mparutsa et al., 1991    Economics corpus                              20,749
Mparutsa et al., 1991    Geology corpus                                [illegible]
Mparutsa et al., 1991    Philosophy corpus                             6,854
Johns, 1988              Transportation & highway engineering corpus   100,000
Johns, 1988              Plant biology corpus                          100,000
Johns, 1991a             New Scientist Corpus                          760,000
Johns, 1991b             Byte Corpus                                   > 1,000,000
Johns, 1991b             Corpus of academic papers                     250,000
Roussel, 1991            New Scientist Corpus                          760,000
[illegible]              Biology lecture corpus                        104,483
Pickard, 1992            Applied linguistics papers                    > 50,000
Ma, 1993a                Direct mail sales letters corpus              16,345
Ma, 1993b                Computer software user manuals                52,000



Corpora type. While general corpora are thought by many (e.g. Tribble and Jones, 1990) to be useful
for ESL, specialist corpora with ESP texts certainly address the needs of a particular group of learners
with "relevance" (Stevens, 1988) and have a definite value in ESP settings. Levy (1990) says:

Concordances drawn from a specific subject area (e.g. scientific texts), a specific mode
(e.g. journalism) or a specific medium (e.g. spoken language) can provide very helpful
data on the range of words and their particular patterns of usage within a given context
or genre (p. 179).

Tribble (1991) demonstrated the need to achieve what he called "face validity" in the use of
corpora. With an analysis of speech-related verbs in one learner corpus and three different specialised
native speaker corpora, Tribble demonstrated the need to use "corpus resources appropriate to the
domain with which the students were already familiar" as different corpora, apart from showing up 
different words, are shown to have sets of words used in dramatically and interestingly different ways. 

Although most existing corpora are collections of well-formed authentic native speaker texts, 
there is also value in assembling a specialised corpus of ESL learner texts. Johns (1986) suggests that
concordancing with learner texts provides an excellent tool for examining recurrent patterns of errors or
successes, and also for studying "the ways in which they manage to avoid syntactic and lexical problems
in the target language" (p. 159). Tribble and Jones (1990) take a step further to suggest that corpora of
learner texts, besides helping to identify and analyse learners' problem areas in lexis, grammar and 
semantics, could shed light on how the native language influences the way English is learnt as a second 
language. 

In fact, King (1989) used a corpus of learner texts from his students studying English for Science
and Engineering to compare with a corpus from professional scientific and technical journals and was
able to observe the differences in the use of sub-technical vocabulary and to point out implications for
teaching. So the use of a student corpus can have its value in informing the teacher and in helping to
devise strategies for teaching before the teacher enters the classroom.

What: potentials and applications 

The 'what' of SCC includes what can be done and what has been done with SCC.

What can be done. A lot has been said about what can be done with classroom concordancing. Johns
(1988) suggests the following six main uses: 

1. CC can be used as "a resource for small scale on-the-ground research by the teacher in order to 
inform teaching decisions". 

2. The teacher can use concordance output to prepare teaching materials. 

3. The teacher can incorporate concordance output directly in teaching materials and "devise activities
that get students to puzzle things out for themselves". 

4. Concordances can be used for "serendipity learning", which is the kind of free-ranging and open-ended 
linguistic enquiry made possible by the rich information concordances provide. 

5. Concordancing can be used interactively as a focus of classroom activity. 

6. The concordancer can be used as "a sleeping resource", offering help when the need arises. 

Tribble and Jones (1990) summarise their suggestions of uses of concordancing in the following ways. 

A. Using concordance outputs for: 

1. deducing the meaning of a keyword from context

2. study of grammatical features of particular words and of general grammatical features

3. study of homonyms and synonyms

4. group work activities

5. gapfill exercises

6. matching exercises

7. remedial exercises based on learners' own writing (p. 55)

B. Interactive uses:

1. learning about grammar

2. vocabulary development 

3. English for specific purposes 
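
As an illustration of how concordance output might be turned into one of the exercise types listed above, the gapfill exercise, here is a minimal sketch. It is not drawn from Tribble and Jones; the function name `gapfill`, the blank marker and the sample sentences are assumptions.

```python
import re

def gapfill(sentences, keyword, blank="______"):
    """Return (exercise_lines, answers) with each keyword occurrence blanked."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    exercise, answers = [], []
    for sentence in sentences:
        found = pattern.findall(sentence)
        if found:  # keep only the sentences that contain the key word
            exercise.append(pattern.sub(blank, sentence))
            answers.extend(found)
    return exercise, answers

# Hypothetical sentences standing in for sentence-concordance output.
lines = [
    "It depends on the size of the corpus.",
    "The teacher prepares the materials.",
    "Students rely on authentic examples.",
]
exercise, answers = gapfill(lines, "on")
for number, item in enumerate(exercise, start=1):
    print(f"{number}. {item}")
```

The returned `answers` list doubles as an answer key, so the same call yields both the student handout and the teacher's copy.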

Levy (1990) strongly recommends the use of CC for the teaching of collocations, which he views 
as one of the most frustrating features of the language for students and teachers at higher levels. He 
believes that "a set of examples as given in a concordance would give the students the correct sense of 
how a word is used" (p. 178). 

Levy (1990) also suggests using on-line concordancing, and integrating it with a word-processor
to give a fully integrated word processing environment. Concordances are seen as an explanatory device
useful for learners using the computer as an electronic writing tool. When a concordancer is integrated
with a word processor with a full dictionary and a thesaurus, the entire system will serve to answer a
student's query about a word or phrase better than a dictionary, a concordance or a thesaurus alone.

HONGKONG PAPERS IN LINGUISTICS AND LANGUAGE TEACHING 16 (1993)

The concordance contributes to the following activities when used in combination with the
dictionary or thesaurus.

1. checking meanings

2. checking general syntax

3. checking usage

4. exploring special lexis especially ESP vocabulary

5. checking derived forms

6. checking collocates of words

7. exploring set pieces, e.g. phrasal verbs, clichés

Figure 1 shows a diagrammatic representation of Levy's idea of an ideal electronic writing 
environment. 

[Figure 1: diagram. Labelled components: User; Word Processor; Resources; Spelling Checker; Style
Checker; Text corpora; Main Dictionary (definitions); Thesaurus; Bilingual Dictionaries.]

Figure 1. Concordancers and word-processing for language learners (From Levy, 1990)

For concordances to be useful, Levy (1990) contends that flexible selection mechanisms are
necessary. Students need to be acquainted with the search and retrieval techniques used in concordancing 
software. He further suggests that the success of any concordance program depends on "flexible and 
efficient user interface" as well as the "quality and relevance" of the text corpora. 

Levy suggests concordancing with 

1. adjacent words ordered alphabetically, 

2. common words, and 

3. small specific corpora. 
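
The first of these suggestions, ordering the adjacent words alphabetically, can be sketched as below. This is one possible reading of the suggestion rather than Levy's own procedure: the sketch sorts on the first word to the right of the search word, and the function name and sample text are assumptions. Note that the simple tokeniser ignores sentence boundaries.

```python
import re

def sorted_hits(text, keyword):
    """Return (left_word, keyword, right_word) hits sorted by the right-hand neighbour."""
    words = re.findall(r"[A-Za-z']+", text)
    key = keyword.lower()
    hits = []
    for i, word in enumerate(words):
        if word.lower() == key:
            left = words[i - 1] if i > 0 else ""
            right = words[i + 1] if i + 1 < len(words) else ""
            hits.append((left, word, right))
    # Sort on the right-hand neighbour, ignoring case, so similar patterns cluster.
    return sorted(hits, key=lambda hit: hit[2].lower())

# Hypothetical sample text.
sample = "We depend on data. They rely on texts. It turns on evidence."
for left, word, right in sorted_hits(sample, "on"):
    print(f"{left:>10} {word} {right}")
```

Sorting on the left-hand neighbour instead (`hit[0]`) would group the verbs that precede the search word, which may be the more useful view when studying which words take a given preposition.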

According to Levy, the teacher will need to have at his disposal all the large and small, general 
and specific, corpora in order that students can refer to the most appropriate corpus of text for a relevant 
use of concordances. But one cannot help wondering whether it is practically feasible, and worthwhile,
to do so. 

More concerned with how concordancing can be carried out to benefit learning, Honeyfield
(1989) develops a typology of exercises based on concordance-based material and suggests a four-step
procedure for concordance-based teaching activities, as follows:

1. The student becomes aware of a need for data, for information about how the language is used. Such
awareness may arise from a more communicative task, such as writing a report, or from a more 
language-oriented exercise, e.g. a vocabulary or grammar exercise. 

2. The student consults relevant concordance material, either through direct access to a computer or by 
using concordance material supplied by the teacher. 

3. The student analyses the data and draws conclusions. 

4. The student applies the insights gained to the task in Step 1. (p. 44) 

Flowerdew (1992) suggests a process approach to the teaching of professional genres and believes 
that concordancing has a role to play in helping students discover specific features of a genre or compare 
features of two genres. 

Going beyond ESL learners, Berry (1993) suggests using concordance printouts to help language
teacher trainees to increase their awareness of the language, the rationale for which in fact does not differ 
very much from that applied to ESL learners at an advanced level. 

What has been done 

Experimentation with CC has been reported by quite a number of researchers, though what has
been reported may represent only the tip of the iceberg.

Apart from being seen as an approach in teaching and learning, SCC is also seen as a pedagogic
activity. Stevens (1990) sees concordancing as a form of text manipulation activity, which can be seen as
parallel to other forms of text manipulation such as text reconstruction activities with jumbled sentences
or paragraphs. Taking it a step further, some ESL teachers take SCC as a type of lesson, which could
parallel listening sessions or writing workshops, and SCC in the classroom has thus been called 
concordancing sessions (e.g. in Mparutsa et al., 1991). 

What has been reported in the current literature about CC applications falls into either 
pre-classroom or classroom use. Pre-classroom use of CC refers to the transformation of concordance 
outputs into teaching materials in the form of either overhead transparencies (OHTs) or paper-based 
classroom tasks or exercises. Classroom applications of CC, on the other hand, represent the interactive 
use of concordancing, sometimes also called on-line concordancing (Levy, 1990). Here either the 
teacher directs the learners to generate concordances for discovery-type study of language features or 
language use, or learners are allowed self-access to the corpora to carry out student-initiated linguistic 
enquiry and research. 
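
The generation of KWIC lines that underlies interactive concordancing can be illustrated with a minimal sketch. It is not the software used in any of the studies cited here; the function name `kwic`, the `width` parameter and the sample text are assumptions made purely for illustration.

```python
import re


def kwic(text, keyword, width=36):
    """Return one KWIC line per whole-word, case-insensitive match of `keyword`.

    Each line holds `width` characters of left context (right-aligned),
    the matched word, and up to `width` characters of right context,
    so that the keyword starts in the same column on every line.
    """
    flat = " ".join(text.split())  # collapse line breaks and runs of whitespace
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for m in pattern.finditer(flat):
        left = flat[max(0, m.start() - width):m.start()]
        right = flat[m.end():m.end() + width]
        lines.append(f"{left:>{width}}{m.group(0)}{right}")
    return lines


# Invented sample text, not taken from any corpus discussed in this paper.
sample = (
    "For more information, please phone our hotline. "
    "Please act today. If you have any questions, please call us."
)
for line in kwic(sample, "please", width=30):
    print(line)
```

Contexts are cut at a fixed number of characters rather than at word boundaries, which is why concordance lines typically begin and end with truncated words.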

Grammar and vocabulary teaching. Most reported work relating to CC in an ESL setting is concerned 
with the teaching and learning of grammar and vocabulary. Table 2 gives an overview of the kind of work 
reported. 

Table 2. Summary of CC applications in ESL 

Researcher       Year    Students                       Focus                             Nature of application
---------------  ------  -----------------------------  --------------------------------  -------------------------
Johns            1988    Post-experience MSc students   'to' as infinitive or             Interactive concordancing
                                                        preposition, differentiating
                                                        'therefore' and 'hence', the
                                                        use of articles

Tribble          1990a   Advanced students in EAP       Use of prepositions and articles  From paper-based exercises to
                         programme                                                        interactive concordancing

Isle             1991    Students studying [illegible]  Subject-specific                  Interactive searching and
                         in a vocational training                                         [illegible]
                         programme

Johns            1991a   Postgraduate research students Comparing 'convince' &            Paper-based exercises
                                                        'persuade', the use of
                                                        'should'

Johns            1991b   Postgraduate research students The use of 'should',              Paper-based exercises
                                                        'recommend', that- [...]
                                                        'have to'

Mparutsa et al.  1991    Students from a teacher-       Subject-specific vocabulary       Interactive concordancing
                         dominated rule-based learning
                         system

Stevens          1991a   First-year science             Subject-specific vocabulary       Interactive concordancing
                         undergraduates

Stevens          1991b   Undergraduates                 Subject-specific vocabulary       Interactive concordancing

Taylor           1991    Learner teachers               Grammar in general                Edited concordance outputs
                                                                                          presented on OHTs

Ma               1993b   Third-year students in a       Use of the corpus to aid          Interactive concordancing
                         higher diploma course in       writing part of a software
                         computing                      user manual

As most discussion centres on the teaching of vocabulary, whether general or ESP, it is 
worth pointing out that Stevens (1988; 1991a; 1991b) puts forward a strong case for teaching 
vocabulary with classroom concordancing. Stevens (1991b) suggests selecting "the most revealing contexts 
for the same word" from concordance outputs to make gap-filling exercises with multiple contexts, 
which, he argues, reduces the chance of error, increases student confidence and improves performance 
(p. 38). Once students are familiar with how concordances are generated, they can be directed to 
self-access vocabulary study by running what Stevens calls "exploratory concordances". An example of a 
concordance-based gap-filling exercise, taken from Stevens (1991b), is shown in Appendix B. 
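
A gap-filling item of the kind Stevens describes could, in principle, be assembled along the following lines. This is a rough sketch under stated assumptions, not Stevens' own procedure: the function `gap_fill_item`, its parameters and the sample sentences are hypothetical, and contexts are simply taken in order of occurrence, whereas Stevens recommends that the teacher select the most revealing ones.

```python
import re


def gap_fill_item(text, word, contexts=3, width=28, blank="________"):
    """Build one gap-fill item: several truncated contexts of `word`, keyword blanked.

    Inflected forms keep their ending visible after the blank
    (e.g. "________ed"), as in truncated concordance output.
    """
    flat = " ".join(text.split())  # collapse line breaks and runs of whitespace
    # Match the word plus any trailing letters, so "review" also finds "reviewed".
    pattern = re.compile(r"\b" + re.escape(word) + r"(\w*)", re.IGNORECASE)
    rows = []
    for m in pattern.finditer(flat):
        left = flat[max(0, m.start() - width):m.start()]
        right = m.group(1) + flat[m.end():m.end() + width]
        rows.append(f"{left:>{width}}{blank}{right}")
        if len(rows) == contexts:
            break
    return rows


# Invented sample sentences, used only to exercise the sketch.
sample = (
    "The committee will review the proposal next week. "
    "Each student reviewed the concordance lines in pairs. "
    "A careful review of the data showed no errors."
)
for label, row in zip("abc", gap_fill_item(sample, "review")):
    print(label, row)
```

Because every row blanks the same word, a learner who is unsure of one context can check a guess against the others, which is the advantage claimed for multiple-context items.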

An empirical study (Stevens, 1991a) comparing traditional gap-fillers with KWIC 
concordance-generated ones concludes that the latter can be seen as a viable alternative to 
the former. The pedagogical value of traditional gap-fill vocabulary exercises is questioned, as an incorrect 
choice of word at the beginning could "compound the error" by taking away yet another contextual clue 
which might be needed for further decoding of the text. It is argued that, though not necessarily superior, 
concordance-based gap-fillers are more easily solved provided that students are given a brief 
familiarisation phase. Stevens claims that "the truncated demi-context typical of concordance output does 
not seem to be a hindrance to their discerning the word missing from the contexts" and that the 
"multiple of disjunct contexts helps them more in settling on a correct word than do the clues inherent 
in a passage of discourse with the same words missing" (p. 55). 

Cross-Linguistic Parallel Concordances. Although CC does not seem at first sight to have any face 
validity for the teaching of pronunciation, Roussel (1991) advocates the use of Cross-Linguistic Parallel 
Concordances (CLPCs) for teaching tonic placement. Roussel carried out a study on transcribed speech 
of English and French and found that CLPCs could be of help in teaching tonic placement related to 
auxiliary verbs in English. Roussel's experiment, which relied on her own intuition about tonic placement 
and on a largely written corpus, may be open to question. But CLPC-based exercises could indeed help 
heighten learners' awareness of the differences between the two languages they speak. The opportunities 
for using CLPCs in the classroom to compare two languages are no doubt open to more research and 
investigation, though for many pairs of languages, such as English and Chinese, parallel concordancing is 
still far from being technically possible. 

Impact of classroom concordancing 

The impact of the CC approach is perhaps best summarised by Johns (1991a), who reports 
having used concordances in his teaching for four years with overseas postgraduate students. Johns claims 
that CC could have an impact on the process of learning, the role of the teacher and the place of 
grammar in ESL teaching. While Johns' first claim is supported by a number of practical applications 
of CC, his second and third claims remain unexplored and open to further research. 

Johns claims that "concordances stimulate enquiry and speculation on the part of the learner", 
and help the learner "to develop the ability to see patterning in the target language and to form 
generalisations to account for that patterning" (p. 2). He reports that by using interactive concordancing, 
his learners were able to provide more valid answers than the teacher could provide intuitively (Johns, 
1991a). 

This claim of Johns is supported by a number of researchers. Mparutsa et al. (1991) found that 
concordancing could help "develop students' learning skills with written text" as well as "promote 
independent and group learning". They also report changes in students' attitudes from the acceptance of 
the textbook as the supreme authority to having a more interactive and inquisitive approach to learning. 

Taking it a step further, Taylor (1991) reports high transferability of discovery learning from 
concordance-based lessons, in that students showed better performance in subsequent text evaluation tasks. 
Mparutsa et al. (1991) also report cases where the student was seen to "contribute his/her developing 
subject knowledge" and the teacher could "contribute knowledge of language functions", leading to an 
understanding of the text through joint discovery (p. 131). 

In addition, a number of other researchers (e.g. Butler, 1991; Isle, 1991; Mparutsa et al., 1991) 
report boosted motivation with the new approach. Isle (1991) points out: 

The motivation is undoubtedly there: my students found the concordance program a 
fascinating piece of software and appreciated its potential for investigating and extracting 
information whether on facts and figures or linguistic questions (p. 107). 

As regards the impact on the teacher, Johns concludes that the teacher's role has 
undergone a healthy change from the traditional roles to "a director and coordinator of student-initiated 
research". Syllabuses, teacher's key books and many traditional practices have to give way to the natural 
data of language, and this role is a challenging one as many new questions remain to be 
answered. 

The third major impact observed by Johns is that the CC approach makes possible "a new 
style of grammatical consciousness-raising by placing the learner's own discovery of grammar at the 
centre of language learning". Johns theorises that "when grammatical description is the product of the 
learner's own engagement with evidence, that description may show a far greater degree of abstraction 
and subtlety than with a given description" and as a result the place of grammar in the ESL classroom 
has to be entirely re-evaluated. 

SCC for test design 

So far, references made to the use of SCC by teachers and researchers lie in a teaching-related 
context. Butler (1991) is perhaps the first to use SCC in an ESL testing environment, namely for 
test construction. He argues that SCC could improve the very popular cloze test in that the bias 
of the text content of a single piece of text could be eliminated by concordance-generated tests of 
the gap-filling type, where a test item appears in a set of different sentences drawn from a number of 
different texts in a corpus. 

Drawing on Oller's (1979) idea that a cloze test "deals with contextually interrelated series of 
blanks", Butler (1991) believes that sentence concordance outputs can be easily manipulated, with the use 
of word-processing software, to provide computer-enhanced cloze tests which, though not providing a 
complete discourse, meet Oller's criterion for a cloze test. What Butler did was to run a concordancer 
through a corpus and have it generate sentence concordances of certain selected words. The role of the 
test designer changes from that of selecting and/or modifying a text to that of selecting the test words 
and the appropriate citations. An example of the test Butler used is given in Appendix C. 
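
The mechanical part of such a procedure might look like the sketch below. It is an illustration only, not Butler's software: the function `sentence_cloze`, its parameters and the sample texts are assumptions, sentence splitting is deliberately naive, and the choice of test words and citations is left, as Butler intends, to the test designer.

```python
import re


def sentence_cloze(texts, word, per_item=4, blank="______"):
    """Collect up to `per_item` sentences containing `word`, at most one per text,
    and replace the first occurrence of the word with a blank.

    Returns (sentences, key), where key is the target word in upper case.
    """
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    chosen = []
    for text in texts:
        # Naive sentence split: ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
            if pattern.search(sentence):
                chosen.append(pattern.sub(blank, sentence, count=1))
                break  # take only one sentence from each text
        if len(chosen) == per_item:
            break
    return chosen, word.upper()


# Invented mini-corpus of three short "texts".
texts = [
    "The bus was late. We still have time to catch the train.",
    "That still leaves one question open. More work is needed.",
    "She is still in London.",
]
sentences, key = sentence_cloze(texts, "still", per_item=3)
for number, sentence in enumerate(sentences, start=1):
    print(f"{number}. {sentence}")
print("Key:", key)
```

Taking at most one sentence per text is one simple way of spreading an item across several sources, which is the property Butler relies on to reduce the content bias of a single passage.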

Of course, much of Butler's argument rests on whether one is convinced that a collection of 
sentence concordance outputs as such can be viewed as the equivalent of a "contextually interrelated 
series of blanks" suitable for the design of cloze tests, and also whether one approves of the test being 
constructed without a complete discourse. The criteria for word selection which Butler used in his 
experiment remain unclear, and, although there was positive feedback from students (p. 34), it remains 
doubtful whether the test so constructed was a valid and reliable one. 

In any case, Butler's reservations about current CBELT (computer-based English 
language testing) software programs, which are based on random deletion (Butler, 1991, p. 33), are 
perhaps sensible. The use of concordances and a corpus, supported by the expertise of the user, the 
teacher or the test designer, is obviously superior to just leaving the job to the machine, and the extra 
time spent could also be well justified. 

Conclusion 

As described above, SCC has been looked at with enthusiasm by most who believe in the use 
of authentic materials in second language teaching. Interestingly enough, even people who believe in 
exercising great control over educational texts may view concordancing positively and believe that 
concordancing with authentic texts has a role to play. Foulds (1991), for example, points out the 
value of concordancers in "monitoring and adjusting linguistic features" in pedagogic texts (pp. 47-53). 

As with any application of new technology in the classroom or in research, both the researcher 
and the students are likely to be excited by it at the beginning. Whether it will last as a 
useful pedagogical tool will depend on serious experimentation in different situations, with students 
of different backgrounds and levels. 

SCC has stirred, and will no doubt continue to stir, a wave of excitement in the field of ESL 
teaching as more and more teachers try out SCC in their classrooms. SCC is still in its infancy and 
has been enthusiastically promoted by a number of people, especially Johns, Stevens and Tribble. 

However, few of its applications have in fact been reported in the literature, and little of the 
learner feedback has been seriously examined. Most work on CC tends towards the speculative 
rather than the evaluative end. Descriptions of learner responses tend to be observation-based rather 
than empirically studied. The influence of CC on the teacher and on the place of grammar has hardly 
been investigated. It remains doubtful whether teachers and learners can cope with the inherent technical 
problems of concordancing, such as hardware operation, search techniques and output procedures, so as 
to make concordancing sessions effective and worthwhile, without the lessons turning into desperate 
attempts to get the hardware, software and database in the right place and the machines to work in the 
right way. 

In particular, few of the dangers of CC are ever cited, though as work associated 
with a new technological tool it can hardly be without pitfalls. In corpus creation, for example, bias 
could be one: owing to the inherent convenience of inputting texts in the written medium, the 
spoken aspect of the language could easily be neglected, and this could leave learners with an 
unbalanced picture of the language. Overdependence on machine-read text input and misuse of 
corpus creation criteria could well be other potential sources of danger. 

So far, applications of CC seem to have been limited to students at the very advanced level and 
to the teaching of grammar and vocabulary. Much has still to be learnt about how it can be employed 
with students at a much lower level than those cited in the current literature, say secondary or 
primary school students. The value of CC in the teaching of macro aspects of the language, such as 
discourse-level features, also remains unexplored. CC clearly cannot be the entirety 
of any ESL course, and so the question remains as to how it can be integrated with other areas of a 
course so that it becomes most fruitful and rewarding. Materials developed from CC are not yet 
marketed for use by ESL teachers at large (Johns is preparing to do this; see Johns, 1991a), and 
there is obviously a long road to travel before CC-conscious researchers see CC popularized. 

Other areas in SCC, such as CLPCs, test design and the teaching of segmental or prosodic features 
in pronunciation, are virtually unexplored territory. What SCC has in store for ESL still awaits 
further effort from teachers and researchers alike if the fruits of the technology are to be 
reaped in greater abundance. 

Appendix A Example of KWIC concordance output 

Concordance for "please" 
Text: S01 (&c) 

S01 134  em. Reply now. for more information, please phone 529 7171.
S02  97  meantime, if you have any questions, please call the Insurance Company of North
S04  54    er, or settle the balance in full. Please take a few moments to look through t
S05  78  orry - your statement is on the way. Please take special note of the total and s
S06  91  dition Picasso! For faster ordering, please call our Merchandise Services on 885
S08  58  some of our other smart money ideas, please check the appropriate box(es) on the
S08  76     to use if you have any questions. Please don't hesitate to use it: 810 882
S09  11       36/F., 88 Queensway. Hong Kong. Please enrol before February 15, 1991 Gu
S09 106  e Plan supports you and your family. Please spend a few moments reading it, then
S09 135  ends February 15, 1991. For enquiry, please call Insurance Company of North Amer
S10  15              Dear Preferred Customer, Please remember to take advantage of your P
S10 103    ion, call our hotline on 886 4234. Please act now - a tax cheque reserved unde
S11  77  while, if you need more information, please feel free to call the Carlingford Ho
S12  77   your Shui Hing Card to spend as you please; or beautiful Estee Lauder lipsticks
S13  40       xecutive" magazine How to apply Please submit all the information including
S13  40  ai, Hong Kong. In case of enquiries, please call our 24-hour Customer Services H
S13  40  3. For additional application forms, please drop by any of our conveniently loca
S15  49  u receive one month's FREE coverage. Please act today.
S16  75  e, if you have additional questions, please don't hesitate to call the AIA Hotli
S17  67  made by Cardmembers throughout 1991. Please see enclosed leaflet for details.
S18  37  s accordingly. For more information, please call our Customer Service Center at
S20 105   special arrangement. For enquiries, please contact the Insurance Company of Nor
S21  37   are limited. For general enquiries, please call our Customer Service Unit at 74
S21  38  49; for product or delivery details, please call the respective advertisers' hot
S22  66    ing, a portion, or nothing at all. Please feel free to phone the Diners Club C
S23  24  you a few weeks ago? If you haven't, please reply today. As it is a convenient a
S23  68   under the HK$500,000 Key Protector, please send in your Confirmation Form now.
S23  87  ediately. If you have any questions, please call the Insurance Company of North
S24  62    Cardholders. For general enquires, please call our Customer Service Unit at
S24  66     for product and delivery details, please call Labonda Ltd. at 541 6689.
S26  58  ply for an even higher credit limit. Please call our Telephone Service Center at
S27  21  you a few weeks ago? If you haven't, please reply today. It was designed exclusi
S27 126     General Manager P.S. Don't delay. Please complete and return the enrollment f
S27 127  rm today. If you have any questions, please call Insurance Company of North Amer
S31  92    nditions apply to the above offer, please call Club Med for details on 521 1
S33  43  already sent in your payment. If so, please excuse this letter. We are concern
S34  82  ion is needed. For more information, please call the Carlingford Hot Line at 827
S35  24    ected for your personal enjoyment. Please browse through this brochure to sele
S35  44   Cardholders. For general enquiries, please call our Customer Service Unit at 74
S35  46  9: for enquiries on the gold stamps, please contact International Collections Lt
S35  47   other product and delivery details, please contact Labonda Ltd. at 541 6689
S36  65    gation. If you have any questions, please call the Insurance Company of North
S40  42     Christmas. For general enquiries, please call our Customer Service Unit at 74
S40  43  49. For product or delivery details, please call the advertisers' hotline number
S41  32  houkan's most valuable contest ever. Please check the enclosed brochure for deta
S41  51   Cardholders. For general enquiries, please call our Customer Service Unit at 74
S41  52  t at 748 4949. For magazine details, please call Asiaweek Limited at 563 6102. T
S47  42     arch 1992. For general enquiries, please call our Customer Service Unit at 74
S47  43  49. For product or delivery details, please call the advertisers' hotlines liste
S49  35  eanwhile, if you have any questions, please don't hesitate to call our 24-hour C
S50  40  rrencies! If you have any questions, please call our CitiPlus Hotline at 861 151

(From Ma, 1993) 

Appendix B Example of concordance-based classroom exercise 

Below you find the result of a "concordance" made on some of these words. 
In this concordance, a computer looked at all the readings in the first-year 
biology workbook. Then the computer printed each line containing those 
words. (The computer doesn't know where words or sentences begin or end; it 
just prints the line.) 

DIRECTIONS: Replace each BLOCK of blank spaces below with ONE WORD from the 
word list above. 

1a     make up the taxonomic ________
 b   one progresses down the ________
 c      At the bottom of the ________
    [right-hand contexts for item 1; their alignment with lines a-c is not recoverable:
     ", the number of organis" / "the differences within"]

2a    s a longitudinal layer ________ing the length of each segme
 b   to form one cord, which ________s along the length of the
 c      single large taproot ________ deep into the soil with oth
 d      which the root hairs ________ Inside the epidermis is

3a     Numerous granules are ________ to the matrix side of the
 b     ividual cells, firmly ________ to each other, rest on a
 c    , by which muscles are ________ to bones, are composed

4a     to capture prey or to ________ the organism in place,
 b     airs of chaetae. They ________ each segment in the soil,
 c      ecause roots are the ________ing and absorbing organs of t

5a  tractile vacuole removes ________ water from the cytoplasm of
 b    The epidermis prevents ________ive water loss and yet al
 c   ncreases the chances of ________ive water-loss but this is pr

6a                        In ________ parts of the cell the ER i
 b       ng to the cells. In ________ animals, cilia covering th
 c      Organisms which have ________ basic features in common ar
 d     e, the mouse develops ________ symptoms and dies. Howe

7a   ilia sweep food into an ________ groove on the side of the cel
 b     side of the cell. The ________ groove leads to the cytophar

8a  he science of biological ________ is known as taxono
 b    is the largest unit of ________ It is split into
 c      The various units of ________ kingdom, phylum.

(From Stevens, 1991b) 

Appendix C Example of concordance-based cloze test 

Each of the sentences below has the same word missing. 
Fill in the blank with the correct word. 

A. 
1. Fortunately we ______ have large amounts of exploitable potential on which to capitalize. 
2. There is no question, however, that food production will ______ have to be raised higher to help feed the world's growing population. 
3. This ______ does not solve the problem. 
4. Here's hoping you're ______ in your old flat by the time this letter reaches you. 

B. 
1. Such an approach is usually the ______ of choice for buying the best car. 
2. I had to live with this ______ for nearly two years. 
3. This is not the ideal ______ for a student to check his or her progress. 
4. This is a common ______ even though many people fail to appreciate that such analysis represents an integral part of the process. 

C. 
1. It is a list of ______ connected with everyday work in an English Secondary School. 
2. One of the first ______ that I did was to settle back into the leather armchair of my study. 
3. As you may imagine, I had rather different ideas on how ______ should be done. 
4. This would have the advantage of making ______ much simpler in terms of presentation. 

D. 
1. In this case, more than 50 years passed between the initiation of the original research and the ______ when production was significantly increased. 
2. Perhaps we could meet next week, when you have ______. 
3. I hope that this answers any outstanding questions for the ______ being. 
4. However, what is less well-known is that over the same period the Government has been training more and more teachers. 

E. 
1. They occur at the same time and ______. 
2. Brass and copper and other metals are all put into three different boxes, but they all end up in the same ______. 
3. I'm selling this ______ as soon as possible, and moving to London. 
4. She is intending to study Chemistry at a British University, but needs an acceptable grade to gain a ______. 

F. 
1. In 1967, my colleagues and I began attempting to ______ pictures of individual genes. 
2. In spite of the difficulties, attempts to ______ such transfers of information are worthwhile. 
3. You may ______ of this what you want. 
4. I should like to ______ the following alterations. 

G. 
1. This discovery created ______ excitement among many scientists and nutritionists. 
2. The progress towards complete re-cycling has been slow, but has made ______ ground over the past 10 years. 
3. These students often had ______ problems adjusting to life in England. 
4. However, statistics indicate that the company is undergoing a ______ decline. 

H. 
1. These varieties ______ better characteristics and earlier maturity. 
2. We ______ not yet determined the minimum lengths of these segments. 
3. The post offers work in the three areas in which I ______ most experience and interest. 
4. CFC's (or chlorofluorocarbons) ______ become notorious in recent years. 

I. 
1. Another example of waste disposal are the heating systems used in ______ modern apartment blocks. 
2. Thus, all types of refuse, except that which goes through the pulverizer, is in ______ way re-graded and then re-cycled. 
3. At this stage ______ relatively sophisticated task might be expected. 
4. Also, with very young children ______ techniques are probably not suitable. 

J. 
1. Insects, when faced with extinction, mutate ______ new races capable of attacking other varieties. 
2. The change probably took place in a farmer's field somewhere in Western Iran about 5,000 years ago, when cultivated wheat was brought ______ the area of a wild one. 
3. Why is the waste being sorted ______ different types? 
4. Special techniques are therefore necessary to introduce desirable material from these wild species ______ the cultivated areas. 

K. 
1. However, experts have said ______ each of these fuel resources will be used up by approximately the year 2020. 
2. All the rubbish ______ will burn is burnt. 
3. Textiles are sent down a second chute, and then undergo a process similar to ______ of paper and card. 
4. Take the book ______ you have obtained from your College, University or local Public Library. 

L. 
1. While Darwin's book immediately generated a great deal of ______ discussion and controversy, Mendel's discovery was largely ignored at first. 
2. It stimulated little ______ for 25 years. 
3. I also enclose a photograph of Liverpool, should this be of ______ to you. 
4. I have a continued ______ in the current range of new products. 

M. 
1. She has clarified the role she wishes to take ______ the work ever be commissioned. 
2. It went from bad to worse after she decided that things ______ be run her way. 
3. Despite her capabilities she never exceeded her responsibilities, and always referred to myself when there was any doubt as to which course of action ______ have been taken. 
4. I think that the textbook ______ have a different subject in each section. 

Key 

A. STILL           H. HAVE 
B. METHOD          I. SOME 
C. THINGS          J. INTO 
D. TIME            K. THAT 
E. PLACE           L. INTEREST 
F. MAKE            M. SHOULD 
G. CONSIDERABLE 

(From Butler, 1991) 